Gating Recurrent Enhanced Memory Neural Networks on Language Identification
Abstract
This paper proposes a novel memory neural network structure, the gating recurrent enhanced memory network (GREMN), to model long-range dependencies in temporal sequences for the language identification (LID) task at the acoustic frame level. The proposed GREMN is a stacked gating recurrent neural network (RNN) equipped with a learnable enhanced memory block near the classifier, which aims to capture long-span history and certain future contextual information of the sequential input. In addition, two optimization strategies are proposed: a coherent SortaGrad-like training mechanism and a hard-sample score acquisition approach. These optimization policies substantially boost the memory-network-based LID system, especially on training material with large disparities. Experimental results confirm that the proposed GREMN possesses strong sequential modeling and generalization ability: about 5% relative equal error rate (EER) reduction is obtained compared with similarly sized gating RNNs, and a 38.5% performance improvement is observed over a conventional i-Vector based LID system.
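The paper's abstract does not give implementation details, so the following is only a minimal sketch of the described architecture under stated assumptions: GRU cells stand in for the gating RNN layers, and the "enhanced memory block" is approximated by a learnable self-attention layer placed just before the frame-level classifier, so that each frame's representation can draw on long-span history and future context. All layer sizes and the language count are illustrative.

```python
# Illustrative sketch only: GRU cells, the attention-style memory block,
# and all dimensions below are assumptions, not the paper's configuration.
import torch
import torch.nn as nn


class GREMNSketch(nn.Module):
    """Hypothetical GREMN: stacked gating RNN + enhanced memory block + classifier."""

    def __init__(self, feat_dim=40, hidden_dim=256, num_layers=2, num_langs=6):
        super().__init__()
        # Stacked gating RNN (GRU chosen here as one possible gating cell).
        self.rnn = nn.GRU(feat_dim, hidden_dim, num_layers=num_layers,
                          batch_first=True)
        # "Enhanced memory block": learnable attention over all frames, so each
        # frame can use long-span history and future contextual information.
        self.attn = nn.MultiheadAttention(hidden_dim, num_heads=4,
                                          batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_langs)

    def forward(self, frames):            # frames: (batch, time, feat_dim)
        h, _ = self.rnn(frames)           # (batch, time, hidden_dim)
        m, _ = self.attn(h, h, h)         # memory-enhanced frame states
        return self.classifier(m)         # frame-level language posteriors


# Usage: score a batch of 2 utterances, 300 frames of 40-dim features each.
logits = GREMNSketch()(torch.randn(2, 300, 40))
print(logits.shape)  # torch.Size([2, 300, 6])
```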
Similar Papers
End-to-End Language Identification Using Attention-Based Recurrent Neural Networks
This paper proposes a novel attention-based recurrent neural network (RNN) to build an end-to-end automatic language identification (LID) system. Inspired by the success of the attention mechanism on a range of sequence-to-sequence tasks, this work introduces the attention mechanism with a long short-term memory (LSTM) encoder to the sequence-to-tag LID task. This unified architecture extends the end...
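As a rough illustration of attention over an LSTM encoder for sequence-to-tag LID, here is a minimal sketch assuming a simple additive per-frame attention scorer; the single-layer encoder and all dimensions are assumptions, not the paper's configuration.

```python
# Minimal sketch: attention-weighted pooling of LSTM encoder states into a
# single utterance vector, then one language tag per utterance.
import torch
import torch.nn as nn


class AttentionLID(nn.Module):
    def __init__(self, feat_dim=40, hidden_dim=128, num_langs=6):
        super().__init__()
        self.encoder = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.score = nn.Linear(hidden_dim, 1)        # per-frame attention score
        self.classifier = nn.Linear(hidden_dim, num_langs)

    def forward(self, frames):                       # (batch, time, feat_dim)
        h, _ = self.encoder(frames)                  # (batch, time, hidden_dim)
        w = torch.softmax(self.score(h), dim=1)      # normalize over time
        utt = (w * h).sum(dim=1)                     # attention-weighted pooling
        return self.classifier(utt)                  # one tag per utterance


print(AttentionLID()(torch.randn(2, 300, 40)).shape)  # torch.Size([2, 6])
```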
Simplified Long Short-term Memory Recurrent Neural Networks: part III
This is part III of a three-part work. In parts I and II, we presented eight variants of simplified Long Short-Term Memory (LSTM) recurrent neural networks (RNNs). Fast computation, especially under constrained computing resources, is an important factor in processing big time-sequence data. In this part III paper, we present and evaluate two new LSTM model variants which drama...
Multi-Language Identification Using Convolutional Recurrent Neural Network
Language identification, an important aspect of automatic speaker recognition, has seen many changes and new approaches to improve performance over the last decade. We compare the performance of using the audio spectrum on a log scale against using polyphonic sound sequences from raw audio samples to train the neural network and to classify speech as either English or Spanish. To achieve this,...
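The following is a hedged sketch of one plausible convolutional recurrent classifier for the English-vs-Spanish setup above, taking a log-scale spectrogram as input; the filter counts, kernel sizes, and pooling scheme are assumptions rather than the paper's actual configuration.

```python
# Sketch: one conv block extracts local time-frequency patterns, a GRU models
# the resulting frame sequence, and the last state feeds a 2-way classifier.
import torch
import torch.nn as nn


class CRNNSketch(nn.Module):
    def __init__(self, n_bins=128, num_langs=2):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d((2, 2)),                     # halve freq and time
        )
        self.rnn = nn.GRU(16 * (n_bins // 2), 64, batch_first=True)
        self.classifier = nn.Linear(64, num_langs)

    def forward(self, spec):                          # (batch, 1, n_bins, time)
        c = self.conv(spec)                           # (batch, 16, n_bins/2, time/2)
        c = c.permute(0, 3, 1, 2).flatten(2)          # (batch, time/2, features)
        h, _ = self.rnn(c)
        return self.classifier(h[:, -1])              # last state -> language


print(CRNNSketch()(torch.randn(2, 1, 128, 200)).shape)  # torch.Size([2, 2])
```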
Gated Feedback Recurrent Neural Networks
In this work, we propose a novel recurrent neural network (RNN) architecture. The proposed RNN, gated-feedback RNN (GF-RNN), extends the existing approach of stacking multiple recurrent layers by allowing and controlling signals flowing from upper recurrent layers to lower layers using a global gating unit for each pair of layers. The recurrent signals exchanged between layers are gated adaptiv...
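The following toy sketch illustrates the gated-feedback idea for a single pair of layers: the upper layer's previous state feeds back into the lower layer, scaled by a learned global gate. The tanh RNN cells, the sizes, and the exact inputs to the gate are simplifications chosen for brevity, not the paper's exact formulation.

```python
# Sketch: a scalar gate in [0, 1], computed from the current input and the
# lower layer's previous state, controls how much of layer 2 flows down.
import torch
import torch.nn as nn


class GatedFeedbackStep(nn.Module):
    def __init__(self, in_dim=32, hid=64):
        super().__init__()
        self.cell1 = nn.RNNCell(in_dim + hid, hid)   # lower layer, sees feedback
        self.cell2 = nn.RNNCell(hid, hid)            # upper layer
        self.gate = nn.Linear(in_dim + hid, 1)       # global gate for pair (2 -> 1)

    def forward(self, x, h1, h2):
        g = torch.sigmoid(self.gate(torch.cat([x, h1], dim=-1)))
        h1_new = self.cell1(torch.cat([x, g * h2], dim=-1), h1)
        h2_new = self.cell2(h1_new, h2)
        return h1_new, h2_new


step = GatedFeedbackStep()
h1 = h2 = torch.zeros(4, 64)
for x in torch.randn(10, 4, 32):                     # unroll over 10 timesteps
    h1, h2 = step(x, h1, h2)
print(h1.shape, h2.shape)                            # torch.Size([4, 64]) each
```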
Understanding Gating Operations in Recurrent Neural Networks through Opinion Expression Extraction
Extracting opinion expressions from text is an essential task of sentiment analysis, which is usually treated as a word-level sequence labeling problem. In such problems, compositional models with multiplicative gating operations provide efficient ways to encode the contexts, as well as to choose critical information. Thus, in this paper, we adopt Long Short-Term Memory (LSTM) recurre...
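As a minimal sketch of the word-level sequence labeling setup described above, the following frames opinion expression extraction as BIO tagging with a bidirectional LSTM, whose multiplicative gates select which contextual information to retain per word; the vocabulary size, embedding size, and tag set are assumed.

```python
# Sketch: embed tokens, encode with a bidirectional LSTM, and emit one
# B/I/O tag score per word.
import torch
import torch.nn as nn


class OpinionTagger(nn.Module):
    def __init__(self, vocab_size=10000, emb=100, hid=128, num_tags=3):  # B, I, O
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb)
        # The LSTM's multiplicative input/forget/output gates decide, per word,
        # which contextual information to keep while encoding the sentence.
        self.lstm = nn.LSTM(emb, hid, batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hid, num_tags)

    def forward(self, token_ids):                    # (batch, words)
        h, _ = self.lstm(self.emb(token_ids))        # (batch, words, 2*hid)
        return self.classifier(h)                    # one tag score per word


print(OpinionTagger()(torch.randint(0, 10000, (2, 20))).shape)  # (2, 20, 3)
```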